Toward a generic representation of random variables for machine learning

Authors: Donnat Philippe, Marti Gautier, Very Philippe
Publication date: 03/09/2015

This paper presents a pre-processing step and a distance that improve the performance of machine learning algorithms working on independent and identically distributed stochastic processes. We introduce a novel non-parametric approach to represent random variables which splits apart dependency and distribution without losing any information. We also propound an associated metric leveraging this representation and its statistical estimate. Besides experiments on synthetic datasets, the benefits of our contribution are illustrated through the example of clustering financial time series, for instance prices from the credit default swaps market. Results are available on the website www.datagrapple.com and an IPython Notebook tutorial is available at www.datagrapple.com/Tech for reproducible research. Comment: submitted to Pattern Recognition Letter

arXiv.org e-Print Archive

HAL-Polytechnique
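
The abstract above describes splitting a random variable into a dependency part and a distribution part without losing information. The following is a minimal sketch of that general idea on a finite sample, not the authors' exact construction: the normalized ranks carry the ordering information, the sorted values carry the empirical marginal, and together they rebuild the original sample. The function names `split_rank_and_margin` and `rebuild` are hypothetical and introduced only for illustration.

```python
import numpy as np

def split_rank_and_margin(x):
    """Split a sample into (normalized ranks, sorted values).

    The ranks keep the ordering of the observations (the part relevant to
    dependence between variables); the sorted values keep the empirical
    marginal distribution.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=int)
    ranks[order] = np.arange(len(x))  # 0-based rank of each observation
    return (ranks + 1) / len(x), x[order]

def rebuild(norm_ranks, sorted_values):
    """Invert the split by looking up each observation through its rank."""
    idx = np.rint(norm_ranks * len(sorted_values)).astype(int) - 1
    return sorted_values[idx]

# Illustrative data only: the split is lossless on this sample.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
u, margin = split_rank_and_margin(x)
assert np.array_equal(rebuild(u, margin), x)
```

Keeping the two parts separate lets a distance weight them independently, which is one plausible reading of the "associated metric" mentioned in the abstract.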

CorrGAN: Sampling Realistic Financial Correlation Matrices Using Generative Adversarial Networks

Author: Marti Gautier
Publication date: 14/12/2019

We propose a novel approach for sampling realistic financial correlation matrices. This approach is based on generative adversarial networks. Experiments demonstrate that generative adversarial networks are able to recover most of the known stylized facts about empirical correlation matrices estimated on asset returns. This is the first time such results are documented in the literature. Practical financial applications range from enhancing trading strategies to risk and portfolio stress testing. Such generative models can also help ground empirical finance deeper into science by allowing for falsifiability of statements and more objective comparison of empirical methods.

arXiv.org e-Print Archive

Crossref

A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series

Authors: Donnat Philippe, Marti Gautier, Nielsen Frank, Very Philippe
Publication date: 17/09/2015

In this paper we present an empirical framework motivated by the practitioner's point of view on stability. The goal is both to assess clustering validity and to yield market insights: the data perturbations we propose provide a multi-view of the assets' clustering behaviour. The perturbation framework is illustrated on an extensive credit default swap time series database available online at www.datagrapple.com. Comment: Accepted at ICMLA 201

arXiv.org e-Print Archive

Crossref

Clustering Financial Time Series: How Long is Enough?

Authors: Andler Sébastien, Donnat Philippe, Marti Gautier, Nielsen Frank
Publication date: 14/04/2016

Researchers have used from 30 days to several years of daily returns as source data for clustering financial time series based on their correlations. This paper sets up a statistical framework to study the validity of such practices. We first show that clustering correlated random variables from their observed values is statistically consistent. We then give a first empirical answer to the much-debated question: how long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out. Comment: Accepted at IJCAI 201

arXiv.org e-Print Archive

HAL-ENS-LYON

HAL-Polytechnique
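
The "too short" side of the trade-off in the abstract above can be illustrated with a small simulation. This is a sketch under simplifying assumptions (i.i.d. bivariate Gaussian returns, a hypothetical true correlation of 0.3, a hypothetical helper name `correlation_estimate_spread`), not the statistical framework of the paper. It only shows that correlation estimates from short samples are noisier; it does not model the smoothing of dynamics in long samples.

```python
import numpy as np

def correlation_estimate_spread(rho, n_obs, n_trials=2000, seed=0):
    """Standard deviation of the sample correlation over repeated draws.

    Each trial draws n_obs i.i.d. pairs from a bivariate normal with true
    correlation rho and records the Pearson estimate.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = np.empty(n_trials)
    for i in range(n_trials):
        sample = rng.multivariate_normal([0.0, 0.0], cov, size=n_obs)
        estimates[i] = np.corrcoef(sample[:, 0], sample[:, 1])[0, 1]
    return estimates.std()

# Compare roughly one month, one year, and four years of daily observations.
for n_obs in (30, 250, 1000):
    print(n_obs, round(correlation_estimate_spread(0.3, n_obs), 3))
```

A noisier correlation matrix feeds noisier distances into any correlation-based clustering, which is one way spurious clusters can arise from short windows.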